Abstract: The elastic net is a popular and successful method in statistics and machine learning for regularized regression, with applications in regression, data analysis, and feature selection. It linearly combines the L1 and L2 penalties and overcomes the limitations of using either penalty alone. An important unresolved challenge is to set the regularization coefficients (i.e., the weights of the L1 and L2 penalties) with general provable guarantees.
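For concreteness, the objective being tuned can be written in a standard parameterization (notation is illustrative, not taken from the talk):

```latex
\min_{\beta \in \mathbb{R}^d} \;
  \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2
  \;+\; \lambda_1 \lVert \beta \rVert_1
  \;+\; \lambda_2 \lVert \beta \rVert_2^2,
```

where $(\lambda_1, \lambda_2)$ are the regularization coefficients whose selection is the subject of the guarantees above; $\lambda_2 = 0$ recovers the lasso and $\lambda_1 = 0$ recovers ridge regression.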
We consider the problem of tuning the regularization parameters of the elastic net across multiple problem instances, a setting that encompasses both cross-validation and multi-task hyperparameter optimization. We obtain a novel structural result that characterizes the loss, as a function of the tuning parameters, as a piecewise-rational function with algebraic boundaries. We use this to bound the structural complexity of the regularized loss functions and show generalization guarantees for tuning the regularization coefficients in the statistical setting. We also consider the more challenging online learning setting, where we show vanishing average expected regret relative to the optimal parameter pair. We further extend our results to tuning classification algorithms obtained by thresholding regularized regression fits. Our results are the first general learning-theoretic guarantees for this important class of problems that avoid strong assumptions on the data distribution.
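To make the tuning setup concrete, the following is a minimal sketch (not the paper's algorithm) of selecting the penalty weights (lambda1, lambda2) by validation loss over a small grid, with the elastic net solved by a simple coordinate descent in NumPy. All names and grid values here are illustrative assumptions.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the L1 penalty.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def elastic_net(X, y, lam1, lam2, iters=200):
    """Coordinate descent for (1/2n)||y - Xw||^2 + lam1*||w||_1 + lam2*||w||_2^2."""
    n, d = X.shape
    w = np.zeros(d)
    z = (X ** 2).mean(axis=0)  # per-coordinate quadratic curvature
    for _ in range(iters):
        for j in range(d):
            # Residual with coordinate j removed from the fit.
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r / n
            w[j] = soft_threshold(rho, lam1) / (z[j] + 2.0 * lam2)
    return w

# Synthetic sparse-regression instance (illustrative data, not from the paper).
rng = np.random.default_rng(0)
n, d = 120, 10
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:3] = 1.0
y = X @ w_true + 0.05 * rng.normal(size=n)
X_tr, y_tr, X_va, y_va = X[:80], y[:80], X[80:], y[80:]

# Grid search over the two regularization coefficients, scored on held-out loss.
best = None
for lam1 in (0.001, 0.01, 0.1):
    for lam2 in (0.001, 0.01, 0.1):
        w = elastic_net(X_tr, y_tr, lam1, lam2)
        loss = np.mean((X_va @ w - y_va) ** 2)
        if best is None or loss < best[0]:
            best = (loss, lam1, lam2)

print(best)  # (best validation loss, chosen lam1, chosen lam2)
```

The structural result above concerns exactly this map from (lambda1, lambda2) to the loss: it is piecewise rational, which is what makes grid-free tuning with provable guarantees possible, in contrast to the heuristic grid used in this sketch.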
Based on joint work with Nina Balcan, Misha Khodak and Ameet Talwalkar. To appear in NeurIPS 2022.